Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add muldiv_c and muxadd peepopts #4740

Open
wants to merge 68 commits into
base: main
Choose a base branch
from

Conversation

akashlevy
Copy link
Contributor

@akashlevy akashlevy commented Nov 13, 2024

What are the reasons/motivation for this change?

This PR adds two peep-hole optimizations that improve netlist quality:

  1. muldiv_c: y = (a * b_const) / c_const ===> a * eval(b_const / c_const) if b_const % c_const == 0. This implements basic constant propagation for multipliers and dividers with divisible constants.
  2. muxadd: y = s ? (a + b) : a ===> y = a + (s ? b : 0). This provides useful restructuring for improving logic optimization. s ? b : 0 can be optimized to a bitwise AND in subsequent optimizations.

Explain how this is achieved.

  • passes/pmgen/peepopt_muldiv_c.pmg: muldiv_c peepopt
  • passes/pmgen/peepopt_muxadd.pmg: muxadd peepopt
  • passes/pmgen/peepopt.cc: Add to docs and run peepopts during peepopt pass
  • passes/pmgen/Makefile.inc: Add the new sources

If applicable, please suggest to reviewers how they can test the change.

  • YosysHQ to review source code and provide feedback/edits as necessary
    • muxadd
    • muldiv_c
  • Silimate to resolve feedback/edits
    • muxadd
    • muldiv_c
  • YosysHQ to construct test plan of 15-20 small-medium test cases per peepopt
    • muxadd
    • muldiv_c
  • Silimate to review test plan and sign off
    • muxadd
    • muldiv_c
  • YosysHQ to write test cases according to test plan and add to regression
    • muxadd
    • muldiv_c
  • YosysHQ to provide formal equivalence checking scripts with equiv_opt, miter, or FOSS eqy
  • Silimate to sign off on test case implementation
    • muxadd
    • muldiv_c
  • YosysHQ to merge PR

Copy link
Member

@povik povik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Akash, I looked over the muxadd pattern

passes/pmgen/peepopt_muxadd.pmg Outdated Show resolved Hide resolved
passes/pmgen/peepopt_muxadd.pmg Outdated Show resolved Hide resolved
passes/pmgen/peepopt_muxadd.pmg Outdated Show resolved Hide resolved
passes/pmgen/peepopt_muxadd.pmg Show resolved Hide resolved
passes/pmgen/peepopt_muxadd.pmg Outdated Show resolved Hide resolved
@akashlevy
Copy link
Contributor Author

Thanks for the feedback on muxadd. I'll take a look today

passes/pmgen/pmgen.py Outdated Show resolved Hide resolved
passes/pmgen/peepopt_muxadd.pmg Outdated Show resolved Hide resolved
passes/pmgen/peepopt_muxadd.pmg Outdated Show resolved Hide resolved
bool c_const_signed = div->getParam(ID::B_SIGNED).as_bool();
int b_const_int = b_const.as_int(b_const_signed);
int c_const_int = c_const.as_int(c_const_signed);
int b_const_int_shifted = b_const_int << offset;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The as_int() conversion may overflow if the operands are too wide. We should catch this case and bail

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think b_const_int << offset can still overflow

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how to proceed, a test specifically designed for this condition will help.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as you test b_const_int.size() + offset <= 31 you should be fine

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Err, b_const.size() + offset

passes/pmgen/peepopt_muldiv_c.pmg Show resolved Hide resolved
passes/pmgen/peepopt_muldiv_c.pmg Show resolved Hide resolved
}

// Rewire to only keep multiplier
mul->setPort(\B, Const(b_const_int_shifted / c_const_int, b_const_width));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because b_const_int_shifted is shifted up, there's no guarantee it fits into b_const_width

Also this might need special care wrt signedness. B_SIGNED on the multiplier may be false but b_const_int_shifted / c_const_int can be negative due to the divider. I wouldn't be opposed to requiring matching signedness on the divider and multiplier to make the transformation a little easier to reason about

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@povik should I change to: mul->setPort(\B, Const(b_const_int_shifted / c_const_int, b_const_width + offset));

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would work but I suggest constructing the Const with 32 bits (the constructor default) and then calling compress(/signedness/) to fit to the constant. The reason for this is: unless wreduce is sequenced after peepopt I am not sure any pass would shrink the operand before we bitblast the multiplier

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented here: akashlevy@8b08f81
and
akashlevy@8687e5f

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@povik is this fix acceptable?

@widlarizer
Copy link
Collaborator

I've added an implementation of muldiv_c tests approximately as specified, see comments in the spec doc. Please pull and resolve the failures and feel free to ask for more context

Passing equiv for simplest muxadd case, prevent multiple match/rewiri…
@akashlevy
Copy link
Contributor Author

@povik @georgerennie @widlarizer Can we split each test into a different file so that we can run each test individually? Right now, the script stops every time a test fails, so we can't easily tell how many tests are passing/failing.

Additionally, now that we have a (somewhat) workable pass, it would be great if you can analyze any test failures and ensure that the test is not the problem.

@widlarizer
Copy link
Collaborator

It's a little bit more convenient to keep it in one file for long term maintanance. These are regression tests so they're meant to all pass in sequence and stopping on the first failure is fine. For development you could copy and chop them up and create a temporary directory with them that isn't tracked or is temporarily tracked under version control

@widlarizer
Copy link
Collaborator

The only failing case I see is a missed optimization: (a*(-6))/3 -> a*(-2) which would be valid according to my testing. I think this peepopt should handle this case as well

@alaindargelas
Copy link

@widlarizer I made a fix here: akashlevy#19, but I also have to increase one constant to actually overflow. Since we are looking at the actual value, not the declared size: 4'd4 only takes 3 bits, 4'd8 takes 4 bits and indeed overflow and get caught.

@akashlevy
Copy link
Contributor Author

@widlarizer @povik @georgerennie I just rebased. All of the tests appear to be passing, and I think we have addressed all your review comments. Can you confirm that everything looks ok on your end now?

@alaindargelas
Copy link

alaindargelas commented Dec 21, 2024

Note:
For test tests/arch/quicklogic/qlf_k6n10f/dsp.ys
The muxadd pass prevents Quicklogic inference to map to a QL_DSP2_MULTACC.
Once transformed by this peepopt, the particular inference pattern probably does not match anymore the structure.
PR: akashlevy#22
Since I'm the one who triggered the last PR, it needs approval to run the actions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants